Prediction model for future OHCAs based on geospatial and demographic data: An observational study

This study used demographic data in a novel prediction model to identify areas with high risk of out-of-hospital cardiac arrest (OHCA) in order to target prehospital preparedness. We combined data from the nationwide Danish Cardiac Arrest Registry with geographical and demographic data on a hectare level. Hectares were classified in a hierarchy according to characteristics and pooled to square kilometers (km2). Historical OHCA incidence of each hectare group was supplemented with a predicted annual risk of at least 1 OHCA to ensure future applicability. We recorded 19,090 valid OHCAs during 2016 to 2019. The mean annual OHCA rate was highest in residential areas with no point of public interest and 100 to 1000 residents per hectare (9.7/year/km2), followed by pedestrian streets with multiple shops (5.8/year/km2), areas with no point of public interest and 50 to 100 residents (5.5/year/km2), and malls (4.6/year/km2). Other high-incidence areas were public transport stations, schools and areas without a point of public interest and 10 to 50 residents. These areas combined constitute 1496 km2 annually, corresponding to 3.4% of the total area of Denmark, and account for 65% of the OHCA incidence. Our prediction model confirms these areas to be of high risk and outperforms simple previous incidence in identifying future risk sites. Two thirds of out-of-hospital cardiac arrests were identified in only 3.4% of the area of Denmark. This area was easily identified as having multiple residents or having airports, malls, pedestrian shopping streets or schools. This result has important implications for targeted interventions such as automatic defibrillators available to the public. Further, demographic information should be considered when implementing such interventions.


Introduction
[3][4] The acute nature of OHCA and its critical time-dependency are 2 reasons why the chances of survival are low; for each minute without cardiopulmonary resuscitation (CPR) and defibrillation, survival drops by 7% to 10%. [5] The emergency medical services (EMS) have long recognized this, and the EMS response time has been a key parameter for benchmarking OHCA treatment for years. Nevertheless, it is difficult for any EMS system to provide a timely response to all OHCAs.
Recognizing this as a potential barrier to improving survival, there has been a surge in activation of volunteer responders, especially in order to reach the bulk of OHCAs that happen in residential areas. [6] These networks typically consist of local volunteers who are alerted by the emergency medical dispatch in case of a suspected OHCA. Volunteers are then either directed towards a publicly available automated external defibrillator (AED) and then to the site, or guided straight to the site to perform CPR. [7] While CPR greatly increases the chance of survival after OHCA, a recent study found that with CPR alone, the chance of survival remains low if no further help arrives within 10 to 13 minutes. [8] High-quality CPR in combination with the use of an AED, however, increases survival dramatically. [9,10] Be it EMS, volunteer responders or publicly available AEDs, the key to a swift response is proximity. Proximity can be achieved either by increasing the number of responders, professionals as well as volunteers, or by increasing AED availability. Without proper prediction of future risk areas, proximity is either immensely resource heavy or unobtainable.
[13][14][15][16][17][18][19] When trying to improve future efforts, using retrospective geographical incidence is a straightforward but fragile approach. This paper combines data on previous incidence with a prediction model including socioeconomic factors, age, ethnicity and education on the most granular level possible to create a novel approach to guide future area-targeted initiatives to improve OHCA preparedness in Denmark, and evaluates this approach.

Setting
Denmark provides an entirely tax-financed health care system, and thus services, including the EMS, are free of charge, providing formal equality of availability to all inhabitants across all socioeconomic layers.

Data collection
Every call that EMS dispatch suspects is an OHCA in Denmark triggers a response, and if any kind of resuscitative effort is initiated by a bystander, first responder or EMS, the EMS personnel are obliged to fill out a form to the Danish Cardiac Arrest Registry. This is also the case if the EMS dispatch did not initially suspect an OHCA.
From this registry, we collected a Global Positioning System (GPS) coordinate of the ambulance halt on site of the OHCA. Arrests without valid GPS were excluded. Data on population size, mean age and education level for each hectare of Denmark were obtained through a private geodata company, relying on data from Statistics Denmark. [20,21] From the same source, data regarding points of public interest in the form of airports, malls, schools, pedestrian streets with multiple shops, public transport stations and major roads in the hectare were obtained. OHCAs within the same hectare were treated as multiple observations and merged with population data.

Permissions
This study complies with the Declaration of Helsinki and did not include human subjects. Data on previous OHCAs contained only GPS location and year, and demographic data were provided on a hectare level with no microdata enabling identification of individuals. The study was approved by the Danish Data Protection Agency. The Danish National Committee on Health Research Ethics does not require ethical approval for registry-based studies. The use of the Danish Cardiac Arrest Registry for the conduct of this study was approved in the North Denmark Region (2008-58-0028).

Predictor variables
Alongside age, immigration status and educational level of the residents of the hectare, available data on points of public interest hypothesized to influence the number of persons in the hectare, and thereby the risk of an OHCA, were included as predictor variables. As such, points of public interest included hubs of transportation, both public transportation and private motoring. Furthermore, available data on public population centers, exemplified by schools, malls and pedestrian streets with multiple shops, were included as hypothesized areas with increased person-flow.
Each hectare was categorized in an area group according to characteristics that are either known or hypothesized to influence the OHCA incidence (Table 1). A single hectare could belong to different area groups over time, and a hierarchical ordering was used to categorize the hectares of Denmark according to the first match in Table 1, starting with the first row. The hierarchy of points of public interest is shown in Table 1. Points of public interest were defined in 2019 and applied to previous years. If a hectare did not contain any point of public interest, it was categorized according to the number of residents and the workforce. Residents were defined as the number of people with a registered residential address in the hectare, stratified into a priori defined strata with assumed external validity. Workforce was defined as the number of people employed at an address within the hectare.
For prediction, we aggregated the following characteristics of the population in each hectare: the proportion, as a number from 0 to 1, of the population aged 60 years or older, the proportion of non-western immigrants, and the proportion of the population with low education levels. Non-western immigrants included descendants of non-western immigrants and persons of unknown origin, whereas lower educational level was defined as the highest completed educational level being grade school, high school, vocational training or unknown educational level. Age, immigration status and educational level were categorical variables measured as a fraction of the residents in each category. Risk in airports was confined to the parts of the airport estimated to be buildings and entrances, whereas schools, which are also a mix of buildings and open areas, were included in their entirety, because open areas generally constitute a smaller fraction in comparison to airports. Hectares with no point of public interest, no residents and no workforce were analyzed separately. All predictor variables are listed in Table 1.
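The first-match rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the order of the points of public interest, the label names, the Hectare type and the handling of stratum boundaries (lower bound included, upper bound excluded) are all assumptions made for illustration, since the upper rows of Table 1 are not reproduced in the text.

```python
from dataclasses import dataclass, field

# ASSUMPTION: illustrative ordering only; the true hierarchy is given by Table 1.
POPI_HIERARCHY = [
    "airport",
    "mall",
    "pedestrian_street",
    "public_transport_station",
    "school",
    "major_road",
]

# Resident strata named in the text, as (inclusive lower bound, label).
# ASSUMPTION: a hectare with exactly 50 residents falls in the 50-to-100 group.
RESIDENT_STRATA = [
    (100, "no_popi_100_1000_residents"),
    (50, "no_popi_50_100_residents"),
    (10, "no_popi_10_50_residents"),
    (1, "no_popi_0_10_residents"),
]


@dataclass
class Hectare:
    """Hypothetical record for one hectare in one year."""
    popi: set = field(default_factory=set)  # points of public interest present
    residents: int = 0
    workforce: int = 0


def classify(hectare: Hectare) -> str:
    """Return the area group of the first matching row in the hierarchy."""
    # Points of public interest take precedence over population strata.
    for name in POPI_HIERARCHY:
        if name in hectare.popi:
            return name
    # Without a point of public interest, fall back to the resident strata.
    for lower_bound, label in RESIDENT_STRATA:
        if hectare.residents >= lower_bound:
            return label
    # No residents: separate hectares with a workforce from empty hectares.
    if hectare.workforce > 0:
        return "no_popi_only_workforce"
    return "no_popi_no_residents_no_workforce"
```

For example, a hectare containing both a mall and a school is assigned to the mall group under this assumed ordering, regardless of how many residents it has.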

Statistical methods
We calculated the previous OHCA incidence rate according to the categories shown in Table 1 for each calendar year between 2016 and 2019, and calculated the mean OHCA incidence rate per km2 in each category across the calendar period.
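As a hedged illustration of this calculation, the sketch below computes a yearly rate per km2 for one area group and then averages across years. The function name and the input numbers are invented for the example and are not taken from the study data; the only fact used is that 100 hectares make up 1 km2.

```python
HECTARES_PER_KM2 = 100  # 1 km2 = 100 hectares


def mean_annual_rate_per_km2(ohca_by_year, hectares_by_year):
    """Mean of the yearly OHCA rates per km2 for one area group.

    Both arguments are dicts keyed by calendar year. The area is allowed to
    differ between years because a hectare can change area group over time.
    """
    yearly_rates = []
    for year, ohca_count in ohca_by_year.items():
        area_km2 = hectares_by_year[year] / HECTARES_PER_KM2
        yearly_rates.append(ohca_count / area_km2)
    return sum(yearly_rates) / len(yearly_rates)


# Invented example: an area group covering 1000 hectares (10 km2) each year.
example_counts = {2016: 40, 2017: 50, 2018: 60, 2019: 50}
example_hectares = {2016: 1000, 2017: 1000, 2018: 1000, 2019: 1000}
example_rate = mean_annual_rate_per_km2(example_counts, example_hectares)
```

With these invented inputs the yearly rates are 4, 5, 6 and 5 per km2, giving a mean of 5 per year per km2.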
We predicted the risk of at least 1 OHCA per year per hectare based on the data from 2016 to 2018 using the predictor variables and models shown in Table 1. In hectares with a point of public interest, the hectare-specific risk of at least 1 OHCA per year was predicted using a random forest model, except for Mall and Airport, where no model was used and the predicted risk was simply the average incidence in the years 2016 to 2018. This approach was chosen for Mall and Airport under the assumption that they were special areas, uninfluenced by other categories in the hierarchy. The random forest model bootstraps from the original dataset, builds a defined number of decision trees, as per Table 1, and aggregates the results. Logistic regression models were used in hectares without a point of public interest (Table 1), where the number of covariates was less suited for random forest. For all random forest models, the number of trees used was 500, except for hectares with a major road, where for computational reasons we used only 40 trees. The logistic regression models included all variables shown in Table 1 and all possible interactions. All methods and covariates were chosen a priori by the researchers.
Using data from 2019, we compared the predicted risks of each hectare with the binary outcome (1 = at least 1 OHCA in 2019, 0 = no OHCA in 2019) using the mean squared error, known as the Brier score. [22] To assess the predictive performance of our model, we used a benchmark model which ignored the predictor variables and predicted the probability of at least 1 OHCA per year for each hectare by the average yearly incidence in that hectare in the training data (2016-2018). The benchmark model corresponds to assuming that areas of future incidence are the same as areas of previous incidence. When comparing two models, the one with the lower Brier score is the better prediction model.
Predicted risk is calculated per hectare to ensure granularity for national, clinical utilization, and multiplied by 100 to report risk per km2 to facilitate external interpretation. This means that the maximum risk is 100 instead of 1, and corresponds to a 100% chance of an event. When presenting risk per km2, each km2 is comprised of 100 hectares which may not be adjacent.
All data from 2016 to 2019 were used to fit our final model, from which we report the predicted risks of at least 1 OHCA per year per hectare. [25][26][27][28][29][30][31] Geographical representation was produced using QGIS. [32]

Results and discussion
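The comparison can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study code: the hectare histories and model risks below are invented, and the benchmark is simplified to the fraction of training years with at least 1 OHCA in the hectare.

```python
def brier_score(predicted, observed):
    """Mean squared difference between predicted risks and 0/1 outcomes."""
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return sum(squared_errors) / len(squared_errors)


def benchmark_risk(training_outcomes):
    """Benchmark prediction for one hectare.

    ASSUMPTION: training_outcomes holds one 0/1 indicator per training year
    (1 = at least 1 OHCA that year), so the benchmark is their average.
    """
    return sum(training_outcomes) / len(training_outcomes)


# Invented example with three hectares and training years 2016 to 2018.
history = [[0, 0, 1], [0, 0, 0], [1, 1, 0]]
observed_2019 = [0, 0, 1]

benchmark_predictions = [benchmark_risk(h) for h in history]
model_predictions = [0.10, 0.05, 0.60]  # hypothetical model output

benchmark_brier = brier_score(benchmark_predictions, observed_2019)
model_brier = brier_score(model_predictions, observed_2019)
```

In this invented example the benchmark Brier score is 2/27 (about 0.074) and the hypothetical model scores 0.0575, so the model would be preferred.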

Previous incidence
In the period 2016 to 2019, Denmark was divided into 4,345,831 hectares. Table 2 shows the distribution of area groups and the incidence of OHCAs according to the hierarchical categorization (Table 1). Between 2016 and 2019 there were a total of 19,090 OHCAs with a valid GPS coordinate. A further 1240 OHCAs were excluded because of missing GPS data. Table 2 shows the percentage of total area and incidence, respectively, according to area type sorted by predicted risk. The red bars, representing the areas of highest predicted risk per km2, and the cyan bars, representing the areas of lowest predicted risk per km2, are accumulated in the pie charts. As is evident from Table 2, malls, public transport stations, pedestrian streets with multiple shops and areas with no point of public interest but 10 to 50, 50 to 100, and 100 to 1000 residents together account for 64.85% of OHCAs between 2016 and 2019 but only 3.4% of the area, corresponding to 1476 km2.

Predictive performance
We evaluated our model against the benchmark model, and the results are depicted in Table 3. As evident from this table, our prediction model performs better than the benchmark model in every area type.

Predicted incidence
The predicted annual risks are depicted in Table 3.
The individual predicted annual risk of one or more OHCAs for each hectare was stratified into 2 groups, between 0 and 2 (both excluded) and ≥2, and plotted on a map of Denmark as shown in Figure 1.

Discussion
This study identifies previous OHCA incidence as being concentrated in geographically small areas, and demonstrates that a prediction model including population characteristics is superior when identifying areas of future incidence.
Previous OHCA incidence is consolidated in selected points of public interest and areas with a high resident population density. While the 95% confidence intervals for risk per km2 are broad, it is clear from Table 3 that the prediction model is better at identifying areas where the next OHCA might happen than simply looking at areas of previous arrests. When looking at the geography in the prespecified categories, the previous incidence and the prediction model align in determining high-incidence categories. The prediction model provides an individual risk for each hectare, of which the mean is presented above. While both previous incidence and predicted risk are high in areas covered by certain points of public interest, the vast majority of OHCAs happen in the geographically larger areas of dense population without points of public interest such as malls or pedestrian streets with multiple shops.
In relation to previous geospatial analyses of OHCA, our data benefit from having one national registry for all OHCAs and a very high level of capture and quality of OHCA data. Obtaining data on geographical and population parameters on a national scale is rare, but utilizing such data is not without caveats. Differences in data availability across countries may limit the external value of the present study, as the available data points chosen in this study are defined by a national entity, and might be differently defined, or altogether unobtainable, in other countries. Other, internationally acknowledged standards of geographical division, such as the "Degree of Urbanisation" (DEGURBA), would allow for more external validity of the present study, but would have decreased granularity tremendously, and would be too inaccurate to utilize in terms of AED placement, as the coverage of an AED is a few hundred meters. [33] Further, only selected parts of the Danish geography are covered by the DEGURBA division. Previous attempts to analyze incidence and outcome of OHCA have largely focused on the characteristics of the individual suffering an OHCA rather than the geographical characteristics of the site, making them less suitable for geographical analysis and prediction. A previous study of geographical data has shown the importance of age and socioeconomic status, but did not consider population density which, in the current study, turns out to be of pivotal importance. [34] Another study on the geography of OHCA, using a methodology similar to ours but with a more modest number of included arrests, only used age and sex as demographic predictors, and made predictions on a coarser, municipality-level scale. [35] [38][39][40] While we have valid data on the OHCAs included, the EMS response time is subject to great changes from incident to incident depending on where the nearest unit is at any given time.
As the amount of rural area with some risk of OHCA is so immense, relying only on EMS response in all of these areas would be unrealistic; this is already part of the motive for implementing a widespread system of professional and volunteer responders. [41] While according to this study most points of public interest have a slightly increased risk, the risk is particularly high in malls, pedestrian streets with multiple shops, and public transport stations. Even after adjustment for areas with people, the predicted risk in airports remained low when compared to other points of public interest. Given that Denmark's largest airport has a yearly incidence of over 7 cardiac arrests, this airport alone is considered to account for the vast majority of total cardiac arrests at airports, and given that military airports were included due to data limitations, the actual risk of specific airports might be underestimated. [42] Preparedness in airports is known to be very good in terms of CPR rate and AED accessibility. [42] When planning future efforts, it is important to bear in mind that the majority of the incidence is located in residential areas. This is a field of great opportunity for improvement, as only 3.8% to 6.4% of the OHCAs occurring in private residences in the period were defibrillated prior to EMS arrival. [2] It has previously been concluded that focusing AED placement on public places rather than residential areas would be more beneficial due to a relatively higher frequency of shockable rhythm in witnessed OHCAs. [43] It is, however, uncertain whether this finding is due to delay in AED application in residential areas, and the results in this paper point towards AED placement in selected residential areas with a dense population as a very important supplement to areas of special activities as above. Previous studies have proposed a coverage radius for a publicly available AED of a quarter mile (~400 m). [44] This corresponds to an area of around 0.5 km2. If applied to our finding that almost 65% of OHCAs occur within an area of 1476 km2, a total of only 2952 well-placed AEDs would theoretically provide coverage of the areas of highest predicted risk. At present there are well over 20,000 publicly available AEDs in Denmark. [45] When instead considering each hectare separately, as per Figure 1, only 55,116 hectares corresponding to 511 km2 have a predicted annual risk above the threshold of 2 per km2. Future research and intervention should ensure that, at the very least, these areas are covered by an AED.
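The coverage arithmetic above can be reproduced with a short Python sketch. The function names are hypothetical, and the calculation is an idealization: it assumes circular coverage areas that tile the high-risk area without overlap or gaps, which is why the result is only a theoretical lower bound.

```python
import math


def coverage_area_km2(radius_km):
    """Area of a circle with the given coverage radius."""
    return math.pi * radius_km ** 2


def aeds_needed(area_km2, coverage_km2):
    """Idealized AED count: total area divided by area covered per AED."""
    return math.ceil(area_km2 / coverage_km2)


quarter_mile_km = 0.4                                  # ~400 m radius
exact_coverage = coverage_area_km2(quarter_mile_km)    # about 0.503 km2
rounded_coverage = 0.5                                 # value used in the text
high_risk_area_km2 = 1476

theoretical_aeds = aeds_needed(high_risk_area_km2, rounded_coverage)
```

Using the rounded coverage of 0.5 km2 per AED, the sketch returns the 2952 AEDs quoted in the text; real placement would need more units because circles overlap and risk areas are not contiguous.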
These findings are very specific to the Danish geography; however, the importance of the selected area types, and the notion of including both points of public interest and demographic data when making geographic predictions of future OHCAs, are universal.

Limitations
Firstly, no matter how good a prediction model is, it comes with the inaccuracy inherent in all attempts to predict the future. Therefore, this study does not claim to produce certainties, but makes a probable guess on possible future occurrences.
Secondly, only OHCAs where a resuscitative effort is initiated are recorded, and arrests in very remote areas may not be discovered until resuscitation is futile, leading to a possible underestimation of incidence in rural areas. These arrests, however, would not constitute an area of potential improvement, as dispatch of AEDs, volunteer responder programs and EMS rely on timely acknowledgment of an OHCA.
Thirdly, the study is, to an extent, defined by data availability, which both limited the amount of freedom in study design and might limit the potential for external reproduction.

Conclusion
The great majority of OHCAs in Denmark occurred in a very small part of the country's geographical area. Most occurred in residential areas with a high density of residents, followed by public streets with commerce. Prediction models including demographic data seem to be efficient for identification of high-risk areas and targeted intervention.

Table 1
Variables and models used for prediction.
Area group | Predictor variables | Model
Major road | Number of residents and workforce, age 60+, immigration from non-western country and lower educational level | Random forest (num.trees = 40)
No PoPI, 100 to 1000 residents | Workforce, age 60+, immigration from non-western country and lower educational level | Logistic regression
No PoPI, 50 to 100 residents | Workforce, age 60+, immigration from non-western country and lower educational level | Logistic regression
No PoPI, 10 to 50 residents | Workforce, age 60+, immigration from non-western country and lower educational level | Logistic regression
No PoPI, 0 to 10 residents | Workforce, age 60+, immigration from non-western country and lower educational level | Logistic regression
No PoPI, only workforce | Workforce | Logistic regression
PoPI = point of public interest.

Table 2
Area and previous incidence according to area type and year.

Table 3
Predictive performance